Abstract

We apply the Spectral Curvature Clustering (SCC) al- 
gorithm to a benchmark database of 155 motion sequences, 
and show that it outperforms all other state-of-the-art meth- 
ods. The average misclassification rate by SCC is 1.41% for 
sequences having two motions and 4.85% for three motions. 

Supp. webpage: http://www.math.umn.edu/~lerman/scc/ 

1. Introduction 

Multiframe motion segmentation is a very important yet 
challenging problem in computer vision. Given multiple 
image frames of a dynamic scene taken by a (possibly mov- 
ing) camera, the task is to segment the point correspon- 
dences in those views into different motions undertaken by 
the moving objects. A more formal definition of the prob- 
lem appears below. 

Problem 1. Consider a dynamic scene consisting of $K$ rigid-body motions
undertaken by $K$ objects relative to a moving camera. Suppose that $F$
frames of images have been taken by the camera, and that $N$ feature points
$y_1, \ldots, y_N \in \mathbb{R}^3$ are detected on the objects. Let
$z_{ij} \in \mathbb{R}^2$ be the coordinates of the feature point $y_j$ in
the $i$-th image frame for every $1 \le i \le F$ and $1 \le j \le N$, and
form $N$ trajectory vectors:
$z_j = [z_{1j}' \; z_{2j}' \; \cdots \; z_{Fj}']' \in \mathbb{R}^{2F}$. The
task is to separate these trajectories $z_1, \ldots, z_N$ into independent
motions undertaken by those objects.
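
As a minimal sketch of how the trajectory vectors of Problem 1 can be assembled in practice, the following assumes the tracked points are stored in an array of shape (F, N, 2). The function name `trajectory_matrix` and this storage layout are illustrative assumptions, not part of the benchmark's file format.

```python
import numpy as np

def trajectory_matrix(tracks):
    """Stack tracked image points into trajectory vectors.

    tracks: array of shape (F, N, 2); tracks[i, j] holds z_ij, the image
    coordinates of feature point j in frame i (assumed layout).
    Returns an array of shape (2F, N) whose j-th column is z_j.
    """
    tracks = np.asarray(tracks, dtype=float)
    F, N, two = tracks.shape
    if two != 2:
        raise ValueError("expected 2-D image coordinates")
    # Put the point axis last, then flatten (frame, coordinate) into 2F rows.
    return tracks.transpose(0, 2, 1).reshape(2 * F, N)
```

Each column of the result is one point in $\mathbb{R}^{2F}$, so segmenting motions amounts to clustering the columns.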

There has been significant research on this subject over
the past few years (see [19, 16] for a comprehensive literature
review). According to the assumption on the camera
model, those algorithms can be divided into the following
two categories:

1. Affine methods [5, 15, 14, 17, 20, 9, 19, 3] assume
an affine projection model, so that the trajectories
associated with each motion live in an affine subspace
of dimension at most three (or a linear subspace
of dimension at most four containing the affine
subspace). Thus, the motion segmentation problem
is equivalent to a subspace clustering problem.
State-of-the-art affine algorithms that have been applied
to this problem include Random Sample Consensus
(RANSAC) [5, 15], Multi-Stage Learning
(MSL) [14], Generalized Principal Component Analysis
(GPCA) [17, 9, 19], Local Subspace Affinity
(LSA) [20], and Agglomerative Lossy Compression
(ALC) [8, 11].

2. Perspective methods [7, 13, 18, 6, 12, 1] assume a
perspective projection model under which point trajectories
associated with each moving object lie on a multilinear
variety. However, clustering multilinear varieties
is a challenging task and very limited research
has been done in this direction.

*This work was supported by NSF grants #0612608 and #0915064.

An extensive benchmark for comparing the performance 
of these algorithms is the Hopkins 155 Database [16]. It 
contains 155 video sequences along with features extracted 
and tracked in all frames for each sequence, 120 of which 
have two motions and the rest (35 sequences) consist of 
three motions. 

In this paper we examine the performance of a recent
affine method, Spectral Curvature Clustering (SCC) [3, 2],
on the Hopkins 155 database and compare it with the other
affine algorithms mentioned above (their results
have been reported in [16, 11] and also partly online at
http://www.vision.jhu.edu/data/hopkins155/).

Our experiments show that SCC outperforms all the 
above-mentioned affine algorithms on this benchmark 
dataset with an average classification error of 1.41% for 
two motions and 4.85% for three motions. In contrast, 
the smallest average misclassification rate among all other 
affine methods is 2.40% for sequences containing two mo- 
tions and 6.26% for sequences with three motions, both 
achieved by ALC [11]. 

The rest of the paper is organized as follows. We first
briefly review the SCC algorithm in Section 2, and then test
in Section 3 the SCC algorithm against other common affine
methods on the Hopkins 155 database. Finally, Section 4
concludes with a brief discussion.

2. Review of the SCC algorithm 

The SCC algorithm [3, Algorithm 2] takes as input a data
set $X = \{x_1, \ldots, x_N\}$, which is sampled from a mixture
of affine subspaces in the Euclidean space $\mathbb{R}^D$ and
possibly corrupted with noise and outliers. The number of
the subspaces $K$ and the maximum¹ of their dimensions $d$
should also be provided by the user. The output of the
algorithm is a partition of the data into $K$ (disjoint) clusters,
$X = \bigcup_{1 \le k \le K} C_k$, representing the affine subspaces.

The initial step of the SCC algorithm is to randomly select
from the data $c$ subsets of (distinct) points with a fixed
size $d+1$. Based on these $c$ $(d+1)$-tuples, an affinity matrix
$A_c \in \mathbb{R}^{N \times c}$ is formed in the following way. Let
$J_1, \ldots, J_c$ be the index sets of the $c$ subsets. Then for each
$1 \le r \le c$ and $1 \le i \le N$, if $i \in J_r$, we set
$A_c(i,r) = 0$ by default; otherwise, we form the corresponding union
$I := [i \; J_r]$ and define

$$A_c(i,r) := e^{-c_p^2(I)/(2\sigma^2)}, \tag{1}$$

in which $\sigma > 0$ is a fixed constant whose automatic
choice is explained later, and $c_p^2(I)$ is the (squared) polar
curvature [3] of the corresponding $d+2$ points,
$x_I := [\cdots x_i \cdots]_{i \in I}$. That is,

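$$c_p^2(I) := \max_{j,k \in I} \|x_j - x_k\|_2^2 \cdot \frac{1}{d+2} \sum_{j \in I} \frac{\det(x_I' \cdot x_I + 1)}{\prod_{k \in I,\, k \ne j} \|x_j - x_k\|_2^2}. \tag{2}$$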

Note that the numerator $\det(x_I' \cdot x_I + 1)$ is, up to a factor,
the (squared) volume of the $(d+1)$-simplex formed by
the $d+2$ points $x_I$. Therefore, the polar curvature can be
thought of as being the volume of the simplex, normalized 
at each vertex, averaged over the vertices, and then scaled 
by the diameter of the simplex. When d + 2 points are sam- 
pled from the same subspace, we expect the polar curvature 
to be close to zero and consequently the affinity close to 
one. On the other hand, when they are sampled from mixed 
subspaces, the polar curvature is expected to be large and 
the affinity close to zero. 
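
The verbal description above can be sketched numerically. The code below assumes the squared polar curvature is the squared diameter of the simplex times the average, over vertices, of the determinant term divided by the product of squared edge lengths at that vertex; the exact normalisation constant and the function name `squared_polar_curvature` are assumptions made for illustration. Points are stored as rows here, so the Gram matrix is `X @ X.T`.

```python
import numpy as np

def squared_polar_curvature(points):
    """points: array of shape (d + 2, D), one (distinct) point per row."""
    X = np.asarray(points, dtype=float)
    m = X.shape[0]                       # m = d + 2 vertices
    # Squared pairwise distances between the vertices.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # det(Gram + 1) is, up to a factor, the squared simplex volume.
    gram = X @ X.T + 1.0
    vol_term = max(np.linalg.det(gram), 0.0)   # clip tiny negative round-off
    # Normalise at each vertex by the squared lengths of its incident edges.
    per_vertex = np.empty(m)
    for j in range(m):
        edges = np.delete(sq_dist[j], j)
        per_vertex[j] = vol_term / np.prod(edges)
    # Average over the vertices and scale by the squared diameter.
    return sq_dist.max() * per_vertex.mean()
```

For three collinear points (d = 1) the determinant vanishes and the curvature is zero, so the corresponding affinity of Eq. (1) equals one; for points in general position the value is strictly positive.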

The SCC algorithm next forms pairwise weights $W$
from the above multi-way affinities:

$$W = A_c \cdot A_c', \tag{3}$$

and applies spectral clustering [10] to find $K$ clusters
$C_1, \ldots, C_K$.

¹By using only the maximal dimension we treat all the subspaces as
d-dimensional. This strategy works quite well in many cases, as
demonstrated in [3].
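
The step from multi-way affinities to clusters can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a normalised spectral clustering in the common style (degree-normalised weights, top K eigenvectors, row normalisation, k-means), and whether this matches [10] in every detail is an assumption. The small deterministic k-means and the name `spectral_clusters` are introduced only for this example.

```python
import numpy as np

def spectral_clusters(A, K, n_iter=50):
    """Cluster N points from an N x c affinity matrix A into K groups."""
    W = A @ A.T                                    # pairwise weights W = A A'
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    Z = W * inv_sqrt[:, None] * inv_sqrt[None, :]  # D^{-1/2} W D^{-1/2}
    # Top-K eigenvectors (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(Z)
    U = vecs[:, -K:]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Tiny k-means with farthest-point initialisation (deterministic).
    centers = [U[0]]
    for _ in range(1, K):
        d2 = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = U[labels == k].mean(axis=0)
    return labels
```

Two points receive a large weight in W when they have high affinity with the same sampled tuples, which is why points on a common subspace end up in the same cluster.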

In order to refine the clusters, SCC then re-samples $c/K$
$(d+1)$-tuples from each of the clusters $C_k$, $1 \le k \le K$,
and re-applies the rest of the steps. This procedure is repeated
until convergence for a best segmentation, and is referred
to as iterative sampling (see [3, Sect. 3.1.1]). Its convergence
is measured by the total orthogonal least squares
(OLS) error of $d$-dimensional affine subspace approximations
$F_1, \ldots, F_K$ to the clusters $C_1, \ldots, C_K$:

$$e_{\mathrm{OLS}} = \sum_{k=1}^{K} \sum_{x \in C_k} \operatorname{dist}^2(x, F_k). \tag{4}$$
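
A minimal sketch of the total OLS error follows, assuming each cluster is approximated by its best-fit d-dimensional affine subspace (through the cluster mean, spanned by the top d principal directions). The function name `ols_error` and the row-wise data layout are illustrative assumptions.

```python
import numpy as np

def ols_error(X, labels, d):
    """Total squared distance of points to their cluster's best-fit d-flat.

    X: (N, D) data, labels: (N,) integer cluster labels, d: subspace dimension.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        C = X[labels == k]
        centered = C - C.mean(axis=0)
        # Singular values beyond the first d carry the off-subspace energy.
        s = np.linalg.svd(centered, compute_uv=False)
        total += float(np.sum(s[d:] ** 2))
    return total
```

The squared singular values beyond the d-th sum to the squared orthogonal distances from the cluster's points to the fitted subspace, so no explicit projection is needed.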

In situations where the ground truth labels of the data points
are known, we also compute the misclassification rate:

$$e_{\%} = \frac{\#\text{ of misclassified points}}{N} \cdot 100\%. \tag{5}$$
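
Because cluster labels are only defined up to a permutation, a misclassification rate has to be evaluated under the best matching between predicted and true labels. The sketch below assumes labels are integers in 0, ..., K-1 and that the rate is taken relative to the total number of points; the exhaustive search over permutations is practical only for small K such as the two or three motions considered here.

```python
from itertools import permutations
import numpy as np

def misclassification_rate(true_labels, pred_labels, K):
    """Percentage of misclassified points, minimised over label permutations."""
    t = np.asarray(true_labels)
    p = np.asarray(pred_labels)
    best = len(t)
    for perm in permutations(range(K)):
        # Relabel the predicted clusters according to this permutation.
        mapped = np.array([perm[label] for label in p])
        best = min(best, int(np.sum(mapped != t)))
    return 100.0 * best / len(t)
```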

The parameter $\sigma$ of Eq. (1) is automatically selected by
SCC at each iteration in the following way. Let $\mathbf{c}$ denote the
sorted vector of all the $(N-d-1) \cdot c$ squared polar curvatures
computed in an arbitrarily fixed iteration. The algorithm
applies the following set of candidate values, which represent
several scales of the curvatures:

$$\{\mathbf{c}((N-d-1) \cdot c / K^q) \mid q = 1, \ldots, d+1\}, \tag{6}$$

and chooses the one for which the error of Eq. (4) is minimized.
A quantitative derivation of the above selection criterion
for $\sigma$ appears in [3, Section 3.1.2]. It is also demonstrated
in [3] that SCC will often fail with arbitrary choices
of $\sigma$.
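
The candidate set of Eq. (6) can be sketched as an index lookup into the sorted curvatures. The rounding of the fractional position (ceiling, 1-based) is an assumption made for this example, since the text does not say how non-integer positions are handled; `sigma_candidates` is a hypothetical helper name.

```python
import numpy as np

def sigma_candidates(sq_curvatures, N, c, d, K):
    """Return the d + 1 candidate values of Eq. (6)."""
    sorted_c = np.sort(np.asarray(sq_curvatures, dtype=float).ravel())
    m = (N - d - 1) * c                  # expected number of curvatures
    if sorted_c.size != m:
        raise ValueError("expected (N - d - 1) * c curvature values")
    candidates = []
    for q in range(1, d + 2):
        # 1-based position m / K^q, rounded up and clamped to a valid index.
        pos = int(np.ceil(m / K ** q))
        candidates.append(sorted_c[max(pos, 1) - 1])
    return candidates
```

Larger q picks a smaller quantile of the curvature distribution, so the candidates sweep from coarse to fine scales; each is then tried in Eq. (1) and the one with the smallest error of Eq. (4) is kept.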

We present (a simplified version of) the SCC algorithm
below (in Algorithm 1). We note that the storage requirement
of the algorithm is $O(N \cdot (D + c))$, and the total running
time is $O(n_s \cdot (d+1)^2 \cdot D \cdot N \cdot c)$, where $n_s$ is the
number of sampling iterations performed (till convergence,
typically $O(d)$).

3. Results 

We compare the SCC algorithm with other state-of-the-art
affine methods, such as ALC [8, 11], GPCA [17, 9, 19],
LSA [20], MSL [14], and RANSAC [5, 15], using the Hopkins
155 benchmark [16]. We also compare the performance
of the affine methods with an oracle, the Reference algorithm
(REF) [16], which fits subspaces using the ground
truth clusters and re-assigns each point to its nearest subspace.
Though it cannot be used in practice, REF verifies the validity
of the affine camera model and provides a basis for comparison
among practical algorithms. The results of the latter six
methods (including REF) are already published in [19, 11],
so we simply copy them from there.




Figure 1. A sample image from each of the three categories in the Hopkins 155 database.



Algorithm 1 Spectral Curvature Clustering (SCC)

Input: Data set $X$, maximal intrinsic dimension $d$, and
number of subspaces $K$ (required); number of sampled
subsets $c$ (default = $100 \cdot K$)
Output: $K$ disjoint clusters $C_1, \ldots, C_K$.
Steps:

1: Sample randomly $c$ subsets of $X$ (with indices
$J_1, \ldots, J_c$), each containing $d+1$ distinct points.

2: For each sampled subset $J_r$, compute the squared polar
curvature of it and each of the remaining $N-d-1$
points in $X$ by Eq. (2). Sort increasingly these
$(N-d-1) \cdot c$ squared curvatures into a vector $\mathbf{c}$.

3: for $q = 1$ to $d+1$ do

• Form the matrix $A_c \in \mathbb{R}^{N \times c}$ by setting
$\sigma = \mathbf{c}((N-d-1) \cdot c / K^q)$ in Eq. (1), and estimate
the weights $W$ via Eq. (3)

• Apply spectral clustering [10] to these weights and
find a partition of the data $X$ into $K$ clusters

end for

Record the partition $C_1, \ldots, C_K$ that has the smallest
total OLS error, i.e., $e_{\mathrm{OLS}}$ of Eq. (4), for the corresponding
$K$ $d$-dimensional affine subspaces.

4: Sample $c/K$ subsets of points (of size $d+1$) from each
$C_k$ found above and repeat Steps 2 and 3 to find $K$
newer clusters. Iterate until convergence to obtain a
best segmentation.



The Hopkins 155 database contains sequences with two 
and three motions, and consists of three categories of mo- 
tions (see Figure 1 for a sample image in each category and 
Table 1 for some summary information of each category, 
e.g., number of sequences, average number of tracked fea- 
tures, and average number of frames): 

• Checkerboard: this category consists of 104 sequences
of indoor scenes taken with a handheld camera
under controlled conditions.

• Traffic: this category consists of 38 sequences of outdoor
traffic scenes taken by a moving handheld camera.

• Other (Articulated/Non-rigid): this category contains
13 sequences displaying motions constrained by
joints, head and face motions, people walking, etc.

Table 1. Summary information of the Hopkins 155 database: number
of sequences (# Seq.), average number of feature points (N),
and average number of frames (F) in each category for two motions
and three motions separately.

It has been proved (e.g., in [9]) that the trajectory vectors
associated with each motion live in a distinct affine subspace of
dimension $d \le 3$ (or a linear subspace of dimension $d \le 4$
containing the affine subspace). Also, it is possible to cluster
the trajectories either in the full space $\mathbb{R}^{2F}$ ($F$ is the
number of frames) or in some projected space (after dimensionality
reduction by PCA), e.g., $\mathbb{R}^{4K}$ ($K$ is the number
of motions) or $\mathbb{R}^{d+1}$. Thus, we will apply the SCC
algorithm (Algorithm 1) to each of the 155 motion sequences
to segment $d$-dimensional subspaces in $\mathbb{R}^D$ in six ways:
$(d, D) = (3, 4), (3, 4K), (3, 2F), (4, 5), (4, 4K), (4, 2F)$.
Each case is correspondingly represented by the shorthand
SCC $(d, D)$.
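
The projection step for the reduced settings can be sketched as an SVD-based PCA on the trajectory matrix. This is a minimal illustration under assumptions: trajectories are stored as columns of a 2F x N array, the data are mean-centred before projection (the text does not specify this), and the helper names are hypothetical. The full-space setting D = 2F needs no projection.

```python
import numpy as np

def project_trajectories(Z, D):
    """Project the columns of Z (2F x N) to D dimensions by PCA."""
    Z = np.asarray(Z, dtype=float)
    centered = Z - Z.mean(axis=1, keepdims=True)
    # Left singular vectors give the principal directions of the trajectories.
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :D].T @ centered         # shape (D, N)

def ambient_dimensions(d, K, F):
    """The three ambient dimensions D paired with a given d in the text."""
    return [d + 1, 4 * K, 2 * F]
```

For example, with d = 3, K = 2 motions and F = 15 frames, the three settings are D = 4, 8 and 30.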

We use the default value $c = 100 \cdot K$ for all SCC $(d, D)$
when applied to the 155 sequences. Also, in order to mitigate
the randomness effect due to initial sampling, we repeat
the experiment 100 times and record only the average
misclassification rate. For each SCC $(d, D)$, we report in
Table 2 the mean and median of the averaged errors for sequences
with two motions, and in Table 3 results on three
motions. Figure 2 shows histograms of the misclassification
rates with the percentage of sequences in which each
algorithm achieved a certain error. The corresponding histograms
for other methods are shown in [19, Figure 3].

Table 2. Misclassification rates for sequences with two motions. ALC 5 and ALC sp respectively represent ALC with projection dimensions
5 and a sparsity-preserving dimension, LSA n means applying LSA in the projected space $\mathbb{R}^n$ (after dimensionality reduction), and REF
refers to the reference algorithm.

Table 3. Misclassification rates for sequences with three motions. ALC 5 and ALC sp respectively represent ALC with projection dimensions
5 and a sparsity-preserving dimension, LSA n means applying LSA in the projected space $\mathbb{R}^n$ (after dimensionality reduction), and
REF refers to the reference algorithm.

4. Discussion 

Looking at Tables 2 and 3, we conclude that the SCC
algorithm (with all six pairs $(d, D)$) outperforms all competing
methods (in terms of the mean error) and is very
close to the reference algorithm (REF). In the checkerboard
category, it even has a better performance than REF.
In addition, SCC has the following two strengths in comparison
with most other affine methods. First, as we observed
in experiments, the performance of SCC is not very
sensitive to its free parameter $c$. In contrast, the ALC
algorithm is very sensitive to its distortion parameter $\varepsilon$ and
often gives an incorrect number of clusters, requiring running
it for many choices of $\varepsilon$ while having no theoretical
guarantee. Second, SCC can be directly applied to the original
trajectory vectors (which are very high dimensional),
so preprocessing of the trajectories, i.e., dimensionality
reduction, is not necessary (unlike GPCA and LSA). Finally,
we remark that SCC also outperforms some perspective
methods, e.g., Local Linear Manifold Clustering
(LLMC) [6] (their misclassification rates are also available
at http://www.vision.jhu.edu/data/hopkins155/).

Figure 2. Histograms of misclassification errors obtained by SCC.

The histograms (in Figure 2) show that the SCC algorithm
obtains a perfect segmentation for 80% of two-motion
sequences and for over 50% of three-motion sequences.
Under this criterion, SCC is at least comparable
to the best algorithms (ALC, LSA 4K, MSL) and the reference
algorithm (REF); see [11, Figure 4] and [19, Figure
3]. Moreover, SCC has the shortest tails; its worst-case
segmentation error (about 35%) is much smaller than those of
other methods, some of which are as large as 50%.

Regarding running time, the SCC algorithm generally
takes 1 to 2 seconds to process one sequence on a compute
server with two dual-core AMD Opteron 64-bit 280 processors
(2.4 GHz) and 8 GB of RAM. It is much faster than
the best competitors such as ALC, LSA 4K, and MSL (see
their computation times in [11, Table 6] and [16, Tables 3 &
5], while also noting that those were all performed on faster
machines).

At the time of finalizing this version we have found out
about the very recent affine method of Sparse Subspace
Clustering (SSC) [4], which reportedly has superb results on
the Hopkins 155 database and outperforms the results reported
here for both SCC and REF. It will be interesting to
test its sensitivity to its tuning parameter $\lambda$ in future work.